Greedy Algorithm
Greedy algorithm for NP-hard problems has performance guarantee under certain conditions. It achieves optimum for many other problems.
NP-Hard Problem
Set Cover Problem: Find a subset $S \subset X$ with the highest score, $ f(S) $, where: $$ f: 2^{X} \to \mathbb{R} $$ Do we have to check all subsets to find the best one?
Theorem
- The greedy algorithm starts with $ S = \emptyset $. It picks one element $ x $ at each step, where:
- If $ f $ subjects to submodularity and monotonicity, then the algorithm guarantees $ (1 - \frac{1}{e}) $ of the optimum.
Proof
Suppose $S_i$ is the set chosen by greedy algorithm at step $i$.
Suppose $S^{*} = \{x_1^{*}, \ldots, x_k^{*} \}$ is the optimal solution of size $k$.
Then:
$$ \begin{aligned} f(S^{*}) &\leq f(S^{*} \cup S_i) \quad \text{[monotonicity]} \\ &= f(\{x_1^{*}, \ldots, x_{k}^{*} \} \cup S_i) \\ &= f(\{x_1^{*}, \ldots, x_{k}^{*} \} \cup S_i) - f(\{x_1^{*}, \ldots, x_{k - 1}^{*} \} \cup S_i) + f(\{x_1^{*}, \ldots, x_{k - 1}^{*} \} \cup S_i) \\ &= \cdots \\ &= \sum_{j = 0}^{k} \left[ f(\{x_1^{*}, \ldots, x_{j}^{*} \} \cup S_i) - f(\{x_1^{*}, \ldots, x_{j - 1}^{*} \} \cup S_i) \right] + f(S_i) \\ &\leq \sum_{x^{*} \in S{^*}} \left[ f(\{x^{*}\} \cup S_i) - f(S_i) \right] + f(S_i) \quad \text{[submodularity]} \\ &\leq k \cdot \left[ f(S_{i+1}) - f(S_i) \right] + f(S_i) \quad \text{[greedy choice at step $i$]} \\ \end{aligned} $$Accordingly: ($ k $: target size; $ l $: steps taken)
$$ \begin{aligned} f(S^{*}) - f(S_i) &\leq k \cdot \left[ f(S_{i+1}) - f(S_i) \right] \\ &\Updownarrow \\ \alpha_i &\leq k \cdot \left[ \alpha_{i} - \alpha_{i+1} \right] \\ \alpha_{i+1} &\leq (1 - \frac{1}{k}) \alpha_i \\ \alpha_{l} &\leq (1 - \frac{1}{k})^{l} \cdot \alpha_0 \\ &\leq (1 - \frac{1}{k})^{l} \cdot f(S^{*}) \\ &\Updownarrow \\ f(S^{*}) - f(S_l) &\leq (1 - \frac{1}{k})^{l} \cdot f(S^{*}) \\ &\leq e^{-\frac{l}{k}} \cdot f(S^{*}) \\ f(S_l) &\geq (1- e^{-\frac{l}{k}}) \cdot f(S^{*}) \end{aligned} $$Result Analysis:
- When $l = k$, greedy algorithm gives 63% approximation guarantee:
- When $l = 2k$, greedy algorithm gives 86% approximation guarantee:
- This means we can use more steps to assure better approximation.